ABSTRACT

The interest in developmental sequences and learning
hierarchies is growing. One approach to the study of such sequences
is to gather data on several variables, each of which corresponds to
a stage, step, or phase in the sequence, and to examine the
associations between the variables as displayed in a contingency
table. If the variables are associated in ways predicted by the
hypothesized sequence, then the data lend support to the sequence.
Goodman's loglinear model for developmental or learning sequences is
presented and illustrated on number concept data gathered by Brainerd
and Fraser. Where its strong assumptions are satisfied, the model
provides a probabilistic framework within which to: (1) test the
plausibility of an hypothesized developmental sequence or learning
hierarchy; (2) compare several hypothesized sequences on the same
data; (3) estimate the proportion of subjects who do not conform to
the sequence; and (4) estimate the proportion of subjects at each
step in the sequence. (Author/RL)
Loglinear Analysis of Learning Hierarchy
and Developmental Sequence Data

The interest in developmental sequences and learning hierarchies is growing.
One approach to the study of such sequences is to gather data on several
variables, each of which corresponds to a stage, step, or phase in the sequence,
and to examine the associations between the variables as displayed in a
contingency table. If the variables are associated in ways predicted by the
hypothesized sequence, then the data lend support to the sequence.

Our purpose is to describe and critically evaluate a class of loglinear
models for contingency table data which can be used to study a priori
hypotheses about developmental sequences or learning hierarchies. Interested
readers can refer to earlier works by Bishop, Fienberg, and Holland (1975)
and Fienberg (1977) for more details on loglinear models.

The Loglinear Model

Although the model can be extended to any desired number of variables,
let us assume for convenience that there are exactly three response variables,
A, B, and C, which can take on values a, b, and c respectively. The three
response variables define a 3-way contingency table. Each way of the table
corresponds to one of the three variables. Within a way, each level
represents one value which can be taken by the corresponding variable. The
frequency in cell (a,b,c) of the table would represent the number of observations
scored at level a on A, b on B, and c on C.

The hypothesized sequence (or each sequence if there is more than one) 
is presumed to divide the contingency table cells into two sets, a set of 
inadmissible cells and a set of admissible cells. An admissible cell 
corresponds to a pattern of scores which might be expected for someone who
conforms to the hypothesized sequence. Each inadmissible cell represents a
pattern which violates the sequence. Deriving the admissible cells generated
by a theory is itself an important and sometimes difficult step in the
explication of a theory. Davison (1979), Davison, King, Kitchener, and Parker
(1980), Froman and Hubert (1980), and Wohlwill (1973) enumerate the admissible
cells for various kinds of theories.

As Goodman (1975) develops the loglinear approach, subjects are divided
into K + 1 classes. $\pi_0$ designates the population proportion of subjects in
the first class, which contains persons whose development does not conform to
the hypothesized sequence. This first class is called the unscaleable class.
Each of the remaining classes corresponds to one of the admissible cells.
For $k = 1, \ldots, K$, $\pi_k$ represents the proportion of subjects in the
population who have advanced along the sequence to the point where they should
exhibit the kth admissible score pattern.

For members of the unscaleable class, the response variables are presumed
to be independent. Within this class, $\pi(a,b,c) = \pi(a)\,\pi(b)\,\pi(c)$.
Consequently, the joint probability of observing an individual from the
unscaleable population with scores (a,b,c) is $\pi_0\,\pi(a)\,\pi(b)\,\pi(c)$.
Members of the kth scaleable subpopulation are all assumed to exhibit the kth
admissible pattern. Consequently, the joint probability of observing an
individual from the kth scaleable subpopulation who exhibits pattern (a,b,c)
is $\pi_k$ if (a,b,c) is the kth admissible score pattern, and it is 0
otherwise. In the total population the probability of observing pattern
(a,b,c) is assumed to be the sum of joint probabilities. That is, the
probability of observing pattern (a,b,c) can be obtained by summing the joint
probability of observing (a,b,c) in each of the (K + 1) subpopulations.
This leads to the fundamental equation of the model:

$$
\pi(a,b,c) =
\begin{cases}
\pi_0\,\pi(a)\,\pi(b)\,\pi(c) & \text{if } (a,b,c) \text{ is inadmissible,} \\
\pi_k + \pi_0\,\pi(a)\,\pi(b)\,\pi(c) & \text{if } (a,b,c) \text{ is the } k\text{th admissible pattern.}
\end{cases}
\tag{1}
$$
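As a rough illustration of Equation 1, the sketch below builds the cell
probabilities for a small table from an unscaleable proportion, a set of
scaleable proportions, and the within-class marginals. The function name, the
choice of admissible cells, and every numeric value are hypothetical; they are
not estimates from any data set discussed here.

```python
from itertools import product


def cell_probabilities(pi0, pi_k, marginals):
    """Return {cell: probability} under Equation 1.

    pi0       -- proportion of the population in the unscaleable class
    pi_k      -- dict mapping each admissible cell (tuple of levels) to pi_k
    marginals -- one list of unscaleable-class marginal probabilities per way
    """
    probs = {}
    for cell in product(*(range(len(m)) for m in marginals)):
        # Independence term contributed by the unscaleable class.
        indep = pi0
        for way, level in enumerate(cell):
            indep *= marginals[way][level]
        # Admissible cells receive an additional pi_k; others receive 0.
        probs[cell] = pi_k.get(cell, 0.0) + indep
    return probs


# Hypothetical parameters for a 3 x 3 table (illustration only).
pi0 = 0.40
pi_k = {(0, 0): 0.25, (1, 1): 0.15, (2, 2): 0.20}
marginals = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]

probs = cell_probabilities(pi0, pi_k, marginals)
total = sum(probs.values())  # equals pi0 + sum of the pi_k values
```

Because the marginals within the unscaleable class each sum to one, the cell
probabilities sum to one whenever $\pi_0$ and the $\pi_k$ together sum to one.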

There are several algorithms for fitting the model of Equation 1 (Goodman,
1975; Davison, 1980; Bishop, Fienberg, & Holland, 1975; Fienberg, 1977)
and several computer programs for implementing the algorithms (Davison &
Thoma, Note 1, 1980; Dixon & Brown, 1979; Larntz, Note 2, 1974). These
programs provide estimates of expected cell frequencies under the model, Pearson
and likelihood ratio goodness-of-fit statistics, and estimates of quantities
from which the model parameters can be obtained. Davison (1980) shows how
to estimate model parameters from the output of the Davison and Thoma (Note 1,
1980) algorithm.

Given multinomial assumptions, the Pearson and likelihood ratio statistics
will be distributed as chi square variables under the null hypothesis
represented by Equation 1. These statistics will have
$N - N_A - N_B - N_C - K + NW - 1$ degrees of freedom. Here $N$ is the total
number of cells; $N_A$, $N_B$, and $N_C$ are the numbers of levels along each
way of the contingency table; $NW$ is the number of ways in the table; and $K$
is again defined as the number of admissible cells. This brings us to a
limitation of the loglinear approach as developed in Goodman (1975). If we are
not to run out of degrees of freedom, then $K$ must be smaller than
$(N - N_A - N_B - N_C + NW - 1)$. The example presented
below will illustrate a situation in which some of the hypothesized sequences
cannot be fitted to the data, because the loglinear model for those sequences
requires more degrees of freedom than the data can sustain.

Davison (1980) presents a more restricted variation of the Goodman (1975)
model, a variation which can sometimes be applied when Goodman's unrestricted
formulation requires too many degrees of freedom. Davison (1980) imposes
the constraint that the ratio $\pi_k / [\pi_0\,\pi(a)\,\pi(b)\,\pi(c)]$ must
equal a constant for every admissible pattern. According to this constraint,
the ratio of the probability of observing an admissible pattern in the
scaleable subpopulation to the probability of observing that same pattern in
the unscaleable subpopulation must be roughly the same for every admissible
pattern. By "roughly the same," we mean the same except for the additive
constant 1 contained in the restriction. Substantively, this means that those
patterns which are most commonly found in the unscaleable subpopulation are
also those most commonly found among those who conform to the hypothesized
sequence. While this constraint is highly restrictive, the example below
illustrates data which satisfy the restricted form of Equation 1. Other
examples can be found in Davison (1979, 1980) and Davison et al. (1980). No
matter how many admissible cells are generated by the hypothesized sequence,
the Pearson and likelihood ratio fit statistics will always have
$(N - N_A - N_B - N_C + NW - 2)$ degrees of freedom.
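The two degrees-of-freedom formulas can be checked with a few lines of
arithmetic. The sketch below is only an illustration; the function names are
hypothetical, and the values of K passed in are example inputs rather than
counts taken from a particular sequence.

```python
from math import prod


def df_unrestricted(levels, n_admissible):
    """N - (sum of levels per way) - K + NW - 1 for the unrestricted model."""
    n_cells = prod(levels)   # N, the total number of cells
    n_ways = len(levels)     # NW, the number of ways in the table
    return n_cells - sum(levels) - n_admissible + n_ways - 1


def df_restricted(levels):
    """N - (sum of levels per way) + NW - 2 for the restricted model."""
    return prod(levels) - sum(levels) + len(levels) - 2


levels = (3, 3)  # a two-way table with three levels along each way
for k in (3, 5, 6):
    # A non-positive result means the unrestricted model cannot be tested.
    print(k, df_unrestricted(levels, k))
print("restricted:", df_restricted(levels))
```

For a 3 x 3 table the unrestricted count reduces to 4 - K, so it is exhausted
once a sequence generates four or more admissible cells, whereas the restricted
form leaves 3 degrees of freedom regardless of K.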

If $\pi_k = 0$ for all of the admissible patterns, then the response variables
satisfy the independence model. In that case, there is no need to postulate
a developmental sequence to account for structure among response variables
because the data do not suggest that such structure exists. The data fully
support the developmental sequence model of Equation 1 only if the independence
model can be rejected, suggesting there is some structure to the data, and
the sequence model of Equation 1 fits the data.

If $\pi_0 = 0$, then Equation 1 is a deterministic model in which every
subject's response pattern is admissible. In other words, the deterministic
form of the hypothesized developmental sequence is a limiting case of the
stochastic model in Equation 1.
Statements of sequences found in the developmental and instructional
literatures are typically deterministic, without any suggestion as to how
measurement and sampling error should be handled. Any probabilistic sequence
model, such as Equation 1, cannot be just a straightforward restatement of the
deterministic sequence hypothesized in the literature, because the stochastic
model must incorporate augmenting assumptions about error to translate the
deterministic statement into probabilistic form. If the data satisfy the
probabilistic model, then they lend support both to the hypothesized
developmental sequence and to the augmenting assumptions. If, on the other hand,
the probabilistic model fails to fit the data, the failure may be because
the developmental sequence is incorrect, the augmenting assumptions are not
satisfied, or both. The loglinear analysis itself does not disentangle the
possible sources of poor fit.
Comparisons Between Sequences

Rather than determining whether a given sequence can be said to describe
the data, a researcher may be interested in comparing several sequences to decide
which best describes a set of data. Within the loglinear framework, there are
two possible approaches to comparing sequences. The first approach incorporates
the restricted model. After fitting the restricted model for each sequence,
the several sequences can be compared on the basis of their Pearson or likelihood
ratio fit statistics. The several fit statistics will be comparable, because
they will all correspond to models having exactly the same number of degrees of
freedom, and all will be based on the same data. To our knowledge, there is no
way to test the statistical significance of differences in fit for the several
models.

The second approach incorporates the unrestricted form of the model in
Equation 1. Within this approach, the goodness-of-fit statistics for two sequences
can be directly compared only if the two sequences generate exactly the same
number of admissible cells. Only then will the two fit statistics have equal
degrees of freedom. Two sequences with unequal numbers of admissible cells cannot
be compared directly on the basis of their fit statistics unless one sequence
constitutes a special case of the other.

To see how models can be compared if one is a special case of the other,
consider two sequences such that the admissible cells for Sequence I are a
proper subset of those for Sequence II. Let the subscript $m = 1, \ldots, M$
designate those cells which are admissible according to Model II but not Model I.
Given the unrestricted form of Equation 1, Model I is a special case of Model II
in which $\pi_m = 0$ for all m. The difference between the two likelihood ratio
fit statistics for Models I and II, $G^2_{I} - G^2_{II}$, is itself approximately
distributed as a chi square statistic with M degrees of freedom under the null
hypothesis $\pi_m = 0$ for all m, given that responses satisfy the more general
model. If the null hypothesis cannot be rejected, then the more general Model II
cannot be said to significantly improve the fit. Parsimony would favor Model I.

In summary, comparisons between sequences based on goodness-of-fit statistics
and the unrestricted version of Equation 1 would be limited to those cases in
which the two models compared have equal degrees of freedom and those cases
in which one model is a special case of the other. If the restricted form of
Equation 1 is applied, any two sequences can be compared regardless of how many
admissible cells each generates.
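The comparison rules for the unrestricted model can be written as a small
decision procedure. The sketch below is only an illustration: the function
names and return labels are hypothetical, and the sets of admissible cells are
invented for the example rather than taken from any sequence in this paper.

```python
def comparison_type(cells_1, cells_2):
    """Classify how two sequences can be compared under the unrestricted model."""
    s1, s2 = set(cells_1), set(cells_2)
    if s1 < s2 or s2 < s1:
        # One admissible set is a proper subset of the other: nested models.
        return "nested"
    if len(s1) == len(s2):
        # Same number of admissible cells: equal degrees of freedom.
        return "equal_df"
    return "not_comparable"


def difference_df(cells_1, cells_2):
    """M: the number of cells admissible in the larger nested model only."""
    s1, s2 = set(cells_1), set(cells_2)
    return len(s1 ^ s2)


# Hypothetical admissible sets on a 3 x 3 table (0-based level indices).
seq_1 = {(0, 0), (1, 1), (2, 2)}
seq_2 = {(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)}
seq_3 = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)}
seq_4 = {(0, 0), (0, 1), (1, 2), (2, 2)}

print(comparison_type(seq_1, seq_2), difference_df(seq_1, seq_2))
print(comparison_type(seq_2, seq_3))
print(comparison_type(seq_1, seq_4))
```

For nested sets the symmetric difference is exactly the set of cells admissible
under the larger model only, so its size gives the M degrees of freedom of the
likelihood ratio difference test.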
Example

Our example is based on data from Brainerd and Fraser's (1975) study of
number development. Brainerd and Fraser scored each subject at one of three
ordination levels and one of three cardination levels. Table 1 displays the
frequency with which subjects were jointly scored at each level of ordination
and cardination. Figure 1 depicts four developmental sequences which might
be used to explain their data: reciprocal priority with ordination preceding
cardination (A), reciprocal priority with cardination preceding ordination (B),
unilateral priority (C), and synchrony (D). Hatched cells are inadmissible.
Numbered cells are admissible.

The unrestricted form of Equation 1 could be fitted only for sequence D.
After estimating the row and column marginals, the data in Table 1 contain
only four remaining degrees of freedom. Sequences A, B, and C have either
five or six admissible cells. Consequently, the unrestricted model for these
sequences requires at least six or seven remaining degrees of freedom. The
restricted version of Equation 1 can be and was applied to all four sequences.

Subjects' levels of ordination and cardination do not appear to be
independent. The Pearson and likelihood ratio statistics were both statistically
significant ($X^2(4) = 17.60$, $G^2(4) = 19.99$, $p < .01$), leading us to reject
the independence model.
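The independence test can be recomputed directly from the frequencies in
Table 1. The sketch below uses only the standard library; the variable names
are ours, and the table is entered with ordination stages as rows and number
conservation stages as columns.

```python
import math

# Frequencies from Table 1 (rows: ordination I-III; columns: conservation I-III).
table = [
    [16, 3, 1],
    [15, 3, 3],
    [23, 4, 27],
]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

pearson = 0.0  # Pearson X^2
g2 = 0.0       # likelihood ratio G^2
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        pearson += (observed - expected) ** 2 / expected
        if observed > 0:  # an empty cell contributes nothing to G^2
            g2 += 2.0 * observed * math.log(observed / expected)

df = (len(table) - 1) * (len(table[0]) - 1)
print(f"N = {n}, df = {df}, X2 = {pearson:.2f}, G2 = {g2:.2f}")
```

With these frequencies both statistics agree with the reported values to
within rounding in the second decimal place.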

For each sequence in Figure 1, we then fitted the restricted version of
Equation 1 using the CONSCAL program of Davison and Thoma (Note 1, 1980). Equations
11 and 13 in Davison (1980) were then used to estimate the probabilities in
Equation 1 from the parameters printed by CONSCAL. Only sequence C, unilateral
priority, would be rejected at any conventional significance level
($X^2(3) = 10.82$, $G^2(3) = 11.19$, $p < .05$) irrespective of which fit measure
is employed. The two reciprocal priority models fit equally well
($X^2(3) = 4.26$, $G^2(3) = 3.37$, $p > .05$) and better than the synchrony model.
Using a .05 level of significance, the Pearson statistic ($X^2(3) = 7.23$) would
lead to rejection of the synchrony model. The likelihood ratio statistic would
not ($G^2 = 6.19$).

For the two models which best fit the data, the reciprocal priority sequences,
Table 2 displays the estimates of model parameters. For sequence A, the parameter
estimates suggest that 51% of the subjects in the population are unscaleable;
that is, they are not conforming to the hypothesized sequence. Thirteen
percent are found at step 1 in the sequence, 11% at step 2, 3% at steps 3 and 4,
and 20% at step 5. For sequence B, parameter estimates suggest that 59% fail
to conform, 13% are found at step 1, 1% at step 2, 3% at step 3, 4% at step
4, and 20% at step 5.

The parameter estimates, $\pi_0$, strongly indicate that neither sequence A
nor B can be considered a "universal" sequence, because the majority of subjects
fail to conform to either sequence. Although the fit statistics for the two
models are identical, A might be preferred. If A rather than B is taken to be
the sequence accounting for dependencies in Table 1, then a slightly higher
proportion of subjects can be said to conform. Parameter estimates suggest that
very few subjects occupy intermediate steps 3 and 4 in sequence A or steps 2
through 4 in sequence B.

For any model that can be said to fit the data, the difference between the
likelihood ratio statistic for the independence model and the developmental
sequence model is itself approximately distributed as a chi square statistic
with one degree of freedom under the null hypothesis
$\pi_1 = \pi_2 = \cdots = \pi_K = 0$.¹
If this conditional likelihood ratio difference statistic leads to rejection of
the null hypothesis at some chosen significance level, then the sequence model
can be said to fit the data significantly better than the independence model.
For both models A and B, the conditional likelihood ratio statistic
($G^2(4) - G^2(3) = G^2(1) = 19.99 - 3.37 = 16.62$, $p < .01$) suggests that the
developmental sequence model significantly outperforms the independence model.
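The arithmetic of the conditional likelihood ratio test can be sketched as
follows. The helper function is ours and relies on the standard identity that
a chi square variable with one degree of freedom is the square of a standard
normal variable, so its upper tail is available through the complementary
error function.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


g2_independence = 19.99  # independence model, 4 degrees of freedom
g2_sequence = 3.37       # restricted reciprocal priority model, 3 degrees of freedom

difference = g2_independence - g2_sequence
df_difference = 4 - 3
p_value = chi2_sf_1df(difference)

print(f"G2 difference = {difference:.2f} on {df_difference} df, p = {p_value:.6f}")
```

The resulting tail probability falls well below .01, in line with the
conclusion stated above.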
Discussion

The loglinear approach to the study of sequences offers several advantages
over alternative approaches. Unlike the Airasian and Bart (1975) and Cliff
(1979) models, the loglinear model is stochastic rather than deterministic.
Whereas Dayton and MacReady's (1976) method applies only to tables having exactly
two levels along each way, the present approach can be applied to tables having
any number of levels along each way. Furthermore, the loglinear analysis is
quite rich. It provides a basis for comparing hypothesized sequences; it
provides tests of fit for each separate model; it provides estimates of the
proportion falling at each step along the sequence; and it provides estimates of
the proportion who fail to conform to the hypothesized sequence.

On the negative side, the assumptions of the model are strong, particularly
if the restricted form of Equation 1 is used. Because the analysis relies on
chi square goodness-of-fit statistics, it suffers from the problems associated
with such statistics. If the degrees of freedom are small, then the statistical
test has low power. Some cells may need to be collapsed if their frequencies
are too small.

There are two problems which will, we suspect, complicate the study of 
developmental sequences via the loglinear or any other method cited above. 
First, the admissible cells generated by two sequences can differ by as little 
as one cell. When the choice of sequence depends so heavily on such a small 
portion of the data, large sample sizes will be needed to reliably distinguish 
between the sequences. When comparing sequences generating highly similar
admissible sets, the sequence favored may vary inconsistently from one study to
the next.

Second, whether a step in an hypothesized sequence is needed to describe 
responses will depend, in part, on the developmental or instructional level
of persons studied. If the subjects are not advanced, then the highest steps
in a sequence may not be needed to account for the data simply because no
subjects have reached those steps. Similarly, in an advanced group, the lowest
steps may not be needed. Consequently, researchers investigating the same
hypothesized sequence in similar populations, but at different points of
instruction, may arrive at quite different conclusions, even if that sequence
provides a useful description of learning in both groups. If sequences
contain quite transitory steps, then at any given time, few people would be
found at those steps. Consistent evidence for a transitory step would be
difficult to obtain.






Footnote 

¹If the unrestricted form of the model is applied, then the number of
degrees of freedom for the likelihood ratio difference statistic equals the
number of admissible cells.
TABLE 1

Bivariate Frequency Distribution Between
Ordination and Number Conservation

Ordination       Number Conservation Stage
Stage              I       II      III

I                 16        3        1
II                15        3        3
III               23        4       27
TABLE 2

Model Parameters

Sequence   π_0   π_1   π_2   π_3   π_4   π_5   π(O_I)  π(O_II)  π(O_III)  π(C_I)  π(C_II)  π(C_III)

A          .51   .13   .11   .03   .03   .20   .15     .12      .73       .65     .16      .19
B          .59   .13   .01   .03   .04   .20   .11     .33      .56       .73     .05      .22






Figure Caption 

Figure 1. Admissible and inadmissible response patterns for four 
developmental sequences. Hatched cells are inadmissible. Numbered cells 
are admissible.